NSF PAR Search | NSF Public Access Repository

piqtree: A Python Package for Seamless Phylogenetic Inference with IQ-TREE

https://doi.org/10.1101/2025.07.13.664626

McArthur, Robert Neil; Wong, Thomas King-Fung; Lang, Yapeng; Morris, Richard Andrew; Caley, Katherine; Mallawaarachchi, Vijini; Minh, Bui Quang; Huttley, Gavin (July 2025, bioRxiv)

piqtree is an easy-to-use, open-source Python package that directly exposes IQ-TREE’s phylogenetic inference engine. It offers Python functions for performing many of IQ-TREE’s capabilities including phylogenetic reconstruction, ultrafast bootstrapping, branch length optimisation, ModelFinder, rapid neighbour-joining, and more. By exposing IQ-TREE’s algorithms within Python, piqtree greatly simplifies the development of new phylogenetic workflows through seamless interoperability with other Python libraries and tools mediated by the cogent3 package. It also enables users to perform interactive analyses with IQ-TREE through, for instance, Jupyter notebooks. We present the key features available in the piqtree library and a small case study that showcases its interoperability. The piqtree library can be installed with pip install piqtree, with the documentation available at https://piqtree.readthedocs.io and source at https://github.com/iqtree/piqtree.
Free, publicly-accessible full text available July 16, 2026
IQ-TREE 3: Phylogenomic Inference Software using Complex Evolutionary Models

https://doi.org/10.32942/X2P62N

Wong, Thomas; Ly-Trong, Nhan; Ren, Huaiyan; Baños, Hector; Roger, Andrew; Susko, Edward; Bielow, Chris; De_Maio, Nicola; Goldman, Nick; Hahn, Matthew; et al (April 2025, ecoevorxiv)

IQ-TREE (http://www.iqtree.org) is a widely used open-source software tool for efficiently inferring phylogenetic trees under maximum likelihood. Here, we present IQ-TREE version 3, the third major release of the software. IQ-TREE 3 significantly extends version 2 with new features, including mixture models as an alternative to partitioned models, gene and site concordance factors to quantify discordance between genomic regions, and a fully-featured sequence simulator. The IQ-TREE 3 source code is available at https://github.com/iqtree/iqtree3.
Free, publicly-accessible full text available April 7, 2026
Order of amino acid recruitment into the genetic code resolved by last universal common ancestor’s protein domains

https://doi.org/10.1073/pnas.2410311121

Wehbi, Sawsan; Wheeler, Andrew; Morel, Benoit; Manepalli, Nandini; Minh, Bui Quang; Lauretta, Dante S; Masel, Joanna (December 2024, Proceedings of the National Academy of Sciences)

The current “consensus” order in which amino acids were added to the genetic code is based on potentially biased criteria, such as the absence of sulfur-containing amino acids from the Urey–Miller experiment which lacked sulfur. More broadly, abiotic abundance might not reflect biotic abundance in the organisms in which the genetic code evolved. Here, we instead identify which protein domains date to the last universal common ancestor (LUCA) and then infer the order of recruitment from deviations of their ancestrally reconstructed amino acid frequencies from the still-ancient post-LUCA controls. We find that smaller amino acids were added to the code earlier, with no additional predictive power in the previous consensus order. Metal-binding (cysteine and histidine) and sulfur-containing (cysteine and methionine) amino acids were added to the genetic code much earlier than previously thought. Methionine and histidine were added to the code earlier than expected from their molecular weights and glutamine later. Early methionine availability is compatible with inferred early use of S-adenosylmethionine and early histidine with its purine-like structure and the demand for metal binding. Even more ancient protein sequences—those that had already diversified into multiple distinct copies prior to LUCA—have significantly higher frequencies of aromatic amino acids (tryptophan, tyrosine, phenylalanine, and histidine) and lower frequencies of valine and glutamic acid than single-copy LUCA sequences. If at least some of these sequences predate the current code, then their distinct enrichment patterns provide hints about earlier, alternative genetic codes.
Full Text Available
Updated site concordance factors minimize effects of homoplasy and taxon sampling

https://doi.org/10.1093/bioinformatics/btac741

Mo, Yu K.; Lanfear, Robert; Hahn, Matthew W.; Minh, Bui Quang (November 2022, Bioinformatics)
Schwartz, Russell (Ed.)

Motivation: Site concordance factors (sCFs) have become a widely used way to summarize discordance in phylogenomic datasets. However, the original version of sCFs was calculated by sampling a quartet of tip taxa and then applying parsimony-based criteria for discordance. This approach has the potential to be strongly affected by multiple hits at a site (homoplasy), especially when substitution rates are high or taxa are not closely related. Results: Here, we introduce a new method for calculating sCFs. The updated version uses likelihood to generate probability distributions of ancestral states at internal nodes of the phylogeny. By sampling from the states at internal nodes adjacent to a given branch, this approach substantially reduces, but does not abolish, the effects of homoplasy and taxon sampling. Availability and implementation: Updated sCFs are implemented in IQ-TREE 2.2.2. The software is freely available at https://github.com/iqtree/iqtree2/releases. Supplementary information: Supplementary information is available at Bioinformatics online.
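The original, parsimony-based sCF described in the abstract above can be illustrated with a short Python sketch. This is not IQ-TREE's implementation: the function name `quartet_scf`, the dictionary-based alignment, the rule that a decisive site shows exactly two states with two taxa each, and the averaging over sampled quartets are all assumptions made for illustration.

```python
import random


def quartet_scf(alignment, clade_a, clade_b, clade_c, clade_d,
                n_quartets=100, seed=0):
    """Illustrative parsimony-based sCF (percent) for the branch AB|CD.

    alignment: dict mapping taxon name -> aligned sequence string.
    clade_a..clade_d: sets of taxon names in the four subtrees around
    the branch (hypothetical input format, not IQ-TREE's).
    """
    rng = random.Random(seed)
    per_quartet = []
    for _ in range(n_quartets):
        # Sample one tip taxon from each of the four subtrees.
        a, b, c, d = (rng.choice(sorted(clade))
                      for clade in (clade_a, clade_b, clade_c, clade_d))
        decisive = concordant = 0
        for sa, sb, sc, sd in zip(alignment[a], alignment[b],
                                  alignment[c], alignment[d]):
            column = [sa, sb, sc, sd]
            if "-" in column:
                continue  # skip gapped sites
            # Assumed decisiveness rule: two states, each seen twice.
            if len(set(column)) != 2 or column.count(sa) != 2:
                continue
            decisive += 1
            # Pattern xxyy groups a with b, which supports the branch.
            if sa == sb:
                concordant += 1
        if decisive:
            per_quartet.append(100.0 * concordant / decisive)
    if not per_quartet:
        return float("nan")
    return sum(per_quartet) / len(per_quartet)


toy_alignment = {"a": "AACT", "b": "AAGT", "c": "GCCA", "d": "GCGA"}
print(quartet_scf(toy_alignment, {"a"}, {"b"}, {"c"}, {"d"}, n_quartets=5))
```

Because each column is scored by parsimony on four tips only, a site that has changed more than once can look concordant or discordant by chance, which is the homoplasy problem the likelihood-based update is meant to reduce.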
New Methods to Calculate Concordance Factors for Phylogenomic Datasets

https://doi.org/10.1093/molbev/msaa106

Minh, Bui Quang; Hahn, Matthew W; Lanfear, Robert (May 2020, Molecular Biology and Evolution)
Rosenberg, Michael (Ed.)
We implement two measures for quantifying genealogical concordance in phylogenomic data sets: the gene concordance factor (gCF) and the novel site concordance factor (sCF). For every branch of a reference tree, gCF is defined as the percentage of “decisive” gene trees containing that branch. This measure is already in wide usage, but here we introduce a package that calculates it while accounting for variable taxon coverage among gene trees. sCF is a new measure defined as the percentage of decisive sites supporting a branch in the reference tree. gCF and sCF complement classical measures of branch support in phylogenetics by providing a full description of underlying disagreement among loci and sites. An easy-to-use implementation and tutorial are freely available in the IQ-TREE software package (http://www.iqtree.org/doc/Concordance-Factor, last accessed May 13, 2020).
Full Text Available
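The gCF definition in the abstract above (the percentage of decisive gene trees containing a reference branch) can be sketched in Python as follows. This is an illustration under stated assumptions, not the IQ-TREE code: the function name `gcf` and the input format are hypothetical, a gene tree is assumed to be decisive when it samples at least one taxon from each of the four subtrees around the branch, and every gene tree is assumed to contain only taxa that also appear in the reference tree.

```python
def gcf(branch_clades, gene_trees):
    """Illustrative gene concordance factor (percent) for one branch.

    branch_clades: four sets of taxa (A, B, C, D) around the reference
    branch, which separates A+B from C+D.
    gene_trees: list of (taxa, splits) pairs, where taxa is the set of
    taxa in the gene tree and splits is a list of sets, each holding the
    taxa on one side of an internal branch (hypothetical format).
    """
    a, b, c, d = (frozenset(x) for x in branch_clades)
    decisive = concordant = 0
    for taxa, splits in gene_trees:
        taxa = frozenset(taxa)
        # Restrict the four subtrees to the taxa this gene tree sampled.
        ra, rb, rc, rd = (clade & taxa for clade in (a, b, c, d))
        if not (ra and rb and rc and rd):
            continue  # not decisive: a whole subtree is missing
        decisive += 1
        left, right = ra | rb, rc | rd
        # Concordant if some gene-tree branch gives the same bipartition.
        if any(frozenset(s) in (left, right) for s in splits):
            concordant += 1
    return 100.0 * concordant / decisive if decisive else float("nan")


reference_branch = ({"a"}, {"b"}, {"c"}, {"d", "e"})
toy_gene_trees = [
    ({"a", "b", "c", "d", "e"}, [{"a", "b"}, {"d", "e"}]),  # concordant
    ({"a", "b", "c", "d", "e"}, [{"a", "c"}, {"d", "e"}]),  # discordant
    ({"a", "b", "c", "d"}, [{"c", "d"}]),                   # concordant, e missing
    ({"b", "c", "d", "e"}, [{"d", "e"}]),                   # not decisive
]
print(gcf(reference_branch, toy_gene_trees))
```

Restricting each subtree to the sampled taxa before comparing bipartitions is what lets gene trees with missing taxa still count, which is the variable taxon coverage the abstract mentions.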
Primate phylogenomics uncovers multiple rapid radiations and ancient interspecific introgression

https://doi.org/10.1371/journal.pbio.3000954

Vanderpool, Dan; Minh, Bui Quang; Lanfear, Robert; Hughes, Daniel; Murali, Shwetha; Harris, R. Alan; Raveendran, Muthuswamy; Muzny, Donna M.; Hibbins, Mark S.; Williamson, Robert J.; et al (December 2020, PLOS Biology)
Jiggins, Chris D. (Ed.)
Our understanding of the evolutionary history of primates is undergoing continual revision due to ongoing genome sequencing efforts. Bolstered by growing fossil evidence, these data have led to increased acceptance of once controversial hypotheses regarding phylogenetic relationships, hybridization and introgression, and the biogeographical history of primate groups. Among these findings is a pattern of recent introgression between species within all major primate groups examined to date, though little is known about introgression deeper in time. To address this and other phylogenetic questions, here, we present new reference genome assemblies for 3 Old World monkey (OWM) species: Colobus angolensis ssp. palliatus (the black and white colobus), Macaca nemestrina (southern pig-tailed macaque), and Mandrillus leucophaeus (the drill). We combine these data with 23 additional primate genomes to estimate both the species tree and individual gene trees using thousands of loci. While our species tree is largely consistent with previous phylogenetic hypotheses, the gene trees reveal high levels of genealogical discordance associated with multiple primate radiations. We use strongly asymmetric patterns of gene tree discordance around specific branches to identify multiple instances of introgression between ancestral primate lineages. In addition, we exploit recent fossil evidence to perform fossil-calibrated molecular dating analyses across the tree. Taken together, our genome-wide data help to resolve multiple contentious sets of relationships among primates, while also providing insight into the biological processes and technical artifacts that led to the disagreements in the first place.
Full Text Available
